The only theory of correlation at present available for practical
use is based on the normal law of frequency, but, unfortunately, this
law is not valid in a great many cases which are both common and
important. It does not hold good, to take examples from biology,
for statistics of fertility in man, for measurements on flowers, or for
weight measurements even on adults. In economic statistics, on the
other hand, normal distributions appear to be highly exceptional:
variations of wages, prices, valuations, pauperism, and so forth, are
always skew. In cases like these we have at present no means of
measuring the correlation by one or more "correlation coefficients"
such as are afforded by the normal theory.

It seems worth while noting, under these circumstances, that in
ordinary practice statisticians never concern themselves with the
form of the correlation, normal or otherwise, but yet obtain results of
interest, though always lacking in numerical exactness and
frequently in certainty. Suppose the case to be one in which two
variables are varying together in time; curves are drawn exhibiting
the history of the two. If these two curves appear, generally
speaking, to rise and fall together, the variables are held to be
correlated. If on the other hand it is not a case of variation with time,
the associated pairs may be tabulated in order according to the
magnitude of one variable, and then it may be seen whether the
entries of the other variable also occur in order. Both methods are
of course very rough, and will only indicate very close correlation,
but they contain, it seems to me, the point of prime importance at
all events with regard to economic statistics. In all the classical
examples of statistical correlation (e.g., marriage-rate and imports,
corn prices and vagrancy, out-relief and wages) we are primarily
concerned only with the question whether a large x is usually
associated with a large y (or small y); the further question as to the
form of this association and the relative frequency of different pairs
of the variables is, at any rate on a first investigation, of
comparatively secondary importance.

Let Ox, Oy be the axes of a three-dimensional frequency-surface
drawn through the mean of the surface parallel to the axes of
measurement, and let the points marked (x) be the means of successive
x-arrays, lying on some curve that may be called the curve of
regression of x on y. Now let a line, RR, be fitted to this curve,
subjecting the distances of the means from the line to some minimal
condition. If the slope of RR is positive we may say that large
values of x are on the whole associated with large values of y; if it is
negative, large values of x are associated with small values of y.
Further, if the slope of RR to the vertical be given we shall have a
measure of a rough practical kind of the shift of the mean of an
x-array when its type y is altered. The equation to RR consequently
gives a concise and definite answer to two most important
statistical questions. It is also evident that if the means of the
arrays actually lie in a straight line (as in normal correlation), the
equation to RR must be the equation to the line of regression.

478 Mr. G. U. Yule. On the Significance of Bravais' Formulæ

[Figure: axes Ox, Oy, the means of the x-arrays marked (x), and the fitted line RR.]

Let n be the number of observations in any x-array, and let d be
the horizontal distance of the mean of this array from the line RR.
I propose to subject the line to the condition that the sum of all
quantities like $nd^2$ shall be a minimum, i.e., I shall use the condition
of least squares. I do this solely for convenience of analysis; I do
not claim for the method adopted any peculiar advantage as regards
the probability of its results. It would, in fact, be absurd to do so,
for I am postulating at the very outset that the curve of regression is
only exceptionally a straight line; there can consequently be no
meaning in seeking for the most probable straight line to represent
the regression.

Let x, y be a pair of associated deviations, let $\sigma$ be the standard
deviation of any array about its own mean, and let

$$x = a + by$$

be the equation to RR. Then for any one array

$$S\{x - (a + by)\}^2 = n\sigma^2 + nd^2 .$$

Hence, extending the meaning of S to summation over the whole
surface,

$$S\{x - (a + by)\}^2 = S(n\sigma^2) + S(nd^2) .$$

But in this expression $S(n\sigma^2)$ is independent of a and b; it is, in
fact, a characteristic of the surface. Therefore, making $S(nd^2)$ a
minimum is equivalent to making

$$S\{x - (a + by)\}^2$$

a minimum. That is to say, we may regard our method in another
light. We may say that we form a single-valued relation

$$x = a + by$$

between a pair of associated deviations, such that the sum of the
squares of our errors in estimating any one x from its y by the
relation is a minimum. This single-valued relation, which we may
call the characteristic relation, is simply the equation to the line of
regression RR. There will be two such equations to be formed
corresponding to the two lines of regression.
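
The decomposition above can be checked numerically. The following is only a sketch under stated assumptions: the data are synthetic, the helper name `decompose` is hypothetical, and the trial line (a = 1.0, b = 0.7) is arbitrary. It groups the pairs into x-arrays by their type y and confirms that the total sum of squares about any line splits into the part $S(n\sigma^2)$, which does not depend on the line, and the part $S(nd^2)$, which does.

```python
import math
import random
from collections import defaultdict

def decompose(pairs, a, b):
    """Return (total, within, between) for the trial line x = a + b*y.

    total   = S{x - (a + b*y)}^2 over all pairs
    within  = S(n sigma^2): squares of each x about the mean of its own array
    between = S(n d^2): n times the squared distance of each array mean from the line
    """
    arrays = defaultdict(list)          # one x-array per type y
    for x, y in pairs:
        arrays[y].append(x)
    total = sum((x - (a + b * y)) ** 2 for x, y in pairs)
    within = 0.0
    between = 0.0
    for y, xs in arrays.items():
        n = len(xs)
        mean = sum(xs) / n
        within += sum((x - mean) ** 2 for x in xs)
        between += n * (mean - (a + b * y)) ** 2
    return total, within, between

# Synthetic skew data with a curved regression (an assumption for illustration).
rng = random.Random(0)
pairs = []
for _ in range(500):
    y = rng.randint(0, 5)
    x = math.exp(0.4 * y) + rng.expovariate(1.0)
    pairs.append((x, y))

total, within, between = decompose(pairs, a=1.0, b=0.7)
other_total, other_within, other_between = decompose(pairs, a=0.0, b=1.5)
```

Since `within` is the same for every trial line, choosing the line that minimises `between` is the same as choosing the line that minimises `total`, which is the point of the paragraph above.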

The idea of the method may at once be extended to the case of
correlation between several variables $x_1$, $x_2$, $x_3$, &c. Let n be the
number of observations in an array of $x_1$'s associated with fixed
values $x_2$, $x_3$, $x_4$, &c., of the remaining variables, let $\sigma$ be the
standard deviation of this array, and let d be the difference of its
mean from the value given by a regression equation

$$x_1 = a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + \dots$$

Then, as before, we shall determine the coefficients $a_{12}$, $a_{13}$, $a_{14}$,
&c., so as to make $S(nd^2)$ a minimum. But this is again equivalent to
making

$$S\{x_1 - (a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + \dots)\}^2$$

a minimum, for

$$S\{x_1 - (a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + \dots)\}^2 = S(n\sigma^2) + S(nd^2) .$$

Hence, we may say that we solve for a single-valued relation

$$x_1 = a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + \dots$$

between our variables; the relation being such that the sum of the
squares of the errors made in estimating $x_1$ from its associated
values $x_2$, $x_3$, &c., is the least possible. In the case of normal
correlation this "characteristic relation" must become the "equation of
regression" which gives the means of any $x_1$-array, as only in this
way can $S(nd^2)$ be made a minimum, i.e., zero.

It might be said that it would be more natural to form a
"characteristic relation" between the absolute values of the variables and
not their deviations from the mean. This may, however, be most
conveniently done by working with the mean as origin until the
characteristic is obtained, and then transferring the equation to zero
as origin. It would be much more laborious and would only lead to
the same result if zero were used ab initio as origin.
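
The claim that the two origins lead to the same relation can be illustrated for a single pair of variables. This is a sketch on synthetic data (the sample and all variable names are assumptions, not taken from the paper): the slope is found once from deviations about the means, and once by solving the two normal equations with zero as origin.

```python
import random

rng = random.Random(1)
Y = [rng.gammavariate(2.0, 3.0) for _ in range(400)]      # absolute values, skew
X = [10.0 + 0.8 * y + rng.expovariate(0.5) for y in Y]
N = len(X)
mean_x, mean_y = sum(X) / N, sum(Y) / N

# Route 1: mean as origin, then transfer the equation to zero as origin.
b_dev = (sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
         / sum((y - mean_y) ** 2 for y in Y))
a_shifted = mean_x - b_dev * mean_y

# Route 2: zero as origin from the start; solve
#   S(X)  = N a    + b S(Y)
#   S(XY) = a S(Y) + b S(Y^2)
Sx, Sy = sum(X), sum(Y)
Sxy = sum(x * y for x, y in zip(X, Y))
Syy = sum(y * y for y in Y)
det = N * Syy - Sy * Sy
a_zero = (Sx * Syy - Sy * Sxy) / det
b_zero = (N * Sxy - Sy * Sx) / det
```

Both routes give the same slope and the same constant up to rounding; the first merely avoids carrying the constant through the algebra.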

We may now proceed to the discussion of the special cases of two,
three, or more variables. The actual formulæ obtained are not, it
will be found, novel in themselves, but throw an unexpected light
on the meaning of the expressions previously given by Bravais* for
the case of normal correlation.

(1) Case of Two Variables. Since x and y represent deviations
from their respective means, we have, using S to denote summation
over the whole surface,

$$S(x) = S(y) = 0 .$$

The characteristic or regression equations which we have to find are
of the form

$$x = a_1 + b_1 y, \qquad y = a_2 + b_2 x \qquad (1).$$

Taking the equation for x first, the normal equations for $a_1$ and $b_1$
are

$$S(x) = N a_1 + b_1 S(y), \qquad S(xy) = a_1 S(y) + b_1 S(y^2) \qquad (2),$$

N being the total number of correlated pairs. From the first of
these equations we have at once

$$a_1 = 0 .$$

From the second

$$b_1 = \frac{S(xy)}{S(y^2)} .$$

To simplify our notation let us write

$$S(x^2) = N\sigma_1^2, \qquad S(y^2) = N\sigma_2^2, \qquad S(xy) = N r \sigma_1 \sigma_2 .$$

$\sigma_1$ and $\sigma_2$ are then the two standard-deviations or errors of mean
square, r is Bravais' value of the coefficient of correlation.
Rewriting $b_1$ in terms of these symbols, we have

$$b_1 = r\frac{\sigma_1}{\sigma_2} \qquad (3).$$

Similarly,

$$a_2 = 0, \qquad b_2 = r\frac{\sigma_2}{\sigma_1} \qquad (4).$$

* "Mémoires par divers Savants," 1846, p. 255, and Professor Pearson's paper
on "Regression, Heredity, &c.," 'Phil. Trans.,' A, vol. 187 (1896), p. 261 et seq.

But the expressions on the right of (3) and (4) are the values
obtained by Bravais on the assumption of normal correlation for the
regression of x on y, and the regression of y on x. That is to say,
the Bravais values for the regressions are simply those values of $b_1$
and $b_2$ which make

$$S(x - b_1 y)^2 \quad \text{and} \quad S(y - b_2 x)^2$$

respectively minima, whatever be the form of the correlation between the
two variables. Again, whatever the form of the correlation, if the
regression be really linear, the equations to the lines of regression are
those given above (as we pointed out in the introduction). This
theorem admits of a very simple and direct geometrical proof.

Let n be the number of correlated pairs in any one array taken
parallel to the axis of x, and let $\theta$ be the angle that the line of
regression makes with the axis of y. Then, for a single array,

$$S(xy) = y\,S(x) = n y^2 \tan\theta ,$$

or extending the significance of S to summation over the whole
surface,

$$S(xy) = N \tan\theta \, \sigma_2^2 ,$$

that is,

$$\tan\theta = r\frac{\sigma_1}{\sigma_2} .$$

In any case, then, where the regression appears to be linear, Bravais'
formulæ may be used at once without troubling to investigate the
normality of the distribution. The exponential character of the surface
appears to have nothing whatever to do with the result.
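
As a rough numerical illustration of this result, the sketch below builds a markedly skew sample (synthetic, and an assumption of this example), computes r, $\sigma_1$ and $\sigma_2$ from the sums $S(x^2)$, $S(y^2)$, $S(xy)$, and checks that the slope $r\sigma_1/\sigma_2$ of equation (3) gives a smaller sum of squared errors than neighbouring slopes. All names are hypothetical.

```python
import math
import random

rng = random.Random(2)
ys = [rng.gammavariate(2.0, 1.5) for _ in range(1000)]    # skew type variable
xs = [0.6 * y + rng.expovariate(1.0) for y in ys]         # skew errors

def centred(values):
    """Deviations of the values from their own mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]

x, y = centred(xs), centred(ys)
N = len(x)
Sxx = sum(a * a for a in x)
Syy = sum(b * b for b in y)
Sxy = sum(a * b for a, b in zip(x, y))

sigma1, sigma2 = math.sqrt(Sxx / N), math.sqrt(Syy / N)
r = Sxy / (N * sigma1 * sigma2)
b1 = r * sigma1 / sigma2      # equation (3): regression of x on y
b2 = r * sigma2 / sigma1      # equation (4): regression of y on x

def residual_sum(slope):
    """S(x - slope*y)^2 over all pairs."""
    return sum((a - slope * b) ** 2 for a, b in zip(x, y))

best = residual_sum(b1)
```

No normality is assumed anywhere in the computation; the two regressions also have the same sign as r, and their geometric mean is |r|.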

To return, again, to the most general case, we see that both
coefficients of regression must have the same sign, namely, the sign
of r. Hence, either regression will serve to indicate whether there is
correlation or no, for there is no reason, a priori, why the values of
$b_1$ and $b_2$, as determined above, should be positive rather than
negative. But, nevertheless, the regressions are not convenient
measures of correlation, for, on comparing two similar cases, we may
find, say,

$$b_1 > b_1', \qquad b_2 < b_2',$$

where $b_1$, $b_2$ and $b_1'$, $b_2'$ are the regressions in the two cases. To which
distribution are we, in such a case, to attribute the greater
correlation? Bravais' coefficient solves the difficulty, we may say, in
one way, by taking the geometrical mean of the two regressions as
the measure of correlation. It will still remain valid for non-normal
correlation. But there are other and less arbitrary interpretations
even in the general case.

Suppose that instead of measuring x and y in arbitrary units we
measure each in terms of its own standard deviation. Then let us
write

$$\frac{x}{\sigma_1} = \rho\,\frac{y}{\sigma_2} \qquad (5),$$

and solve for $\rho$ by the method of least squares. We have omitted a
constant on the right-hand side, since it would vanish as before. We
have, at once,

$$\rho = \frac{S(xy)}{N\sigma_1\sigma_2} = r \qquad (6).$$

That is to say, if we measure x and y each in terms of its own
standard deviation, r becomes at once the regression of x on y, and
the regression of y on x. The regressions being, in fact, the
fundamental physical quantities, r is a coefficient of correlation because it
is a coefficient of regression.*

Again, let us form the sums of the squares of residuals in equations
(1) and (5). Inserting the values of $b_1$, $b_2$, and $\rho$, we have

$$S(x - b_1 y)^2 = N\sigma_1^2 (1 - r^2),$$
$$S(y - b_2 x)^2 = N\sigma_2^2 (1 - r^2),$$
$$S\left(\frac{x}{\sigma_1} - \rho\frac{y}{\sigma_2}\right)^2 = N(1 - r^2) \qquad (7).$$

Any one of these quantities, being the sum of a series of squares,
must be positive. Hence r cannot be numerically greater than unity.
If r be numerically equal to unity, or if the correlation be perfect,
all the above three sums become zero. But

$$S\left(\frac{x}{\sigma_1} \mp \frac{y}{\sigma_2}\right)^2$$

can only vanish if

$$\frac{x}{\sigma_1} = \pm\frac{y}{\sigma_2}$$

in every case, or if the relation hold good,

$$\frac{x_1}{y_1} = \frac{x_2}{y_2} = \frac{x_3}{y_3} = \dots = \pm\frac{\sigma_1}{\sigma_2} \qquad (8),$$

the sign of the last term depending on the sign of r. Hence the
statement that two variables are "perfectly correlated" implies that
relation (8) holds good, or that all pairs of deviations bear the same
ratio to one another. It follows that in correlation, where the means
of arrays are not collinear, or the deviation of the mean of the array
is not a linear function of the deviation of the type, r can never be
unity, though we know from experience that it can approach pretty
closely to that value. If the regression be very far from linear, some
caution must evidently be used in employing r to compare two
different distributions.

* That the regression becomes the coefficient of correlation when each deviation
is measured in terms of its standard-deviation in the case of normal correlation has
been pointed out by Mr. Francis Galton. Vide Pearson, 'Phil. Trans.,' A, vol. 187,
p. 307, note.

In the case of normal correlation, $\sigma_1\sqrt{1 - r^2}$ is the standard
deviation of any array of the x variables, corresponding to a single type of
y's. $\sigma_2\sqrt{1 - r^2}$ is similarly the standard deviation of any array of
the y variables, corresponding to a single type of x's. In the general
case, the first expression may be interpreted as the mean standard
deviation of the x-arrays from the line of regression, and the second
expression as the mean standard deviation of the y-arrays from the
line of regression. Otherwise we may regard

$$\sigma_1\sqrt{1 - r^2}$$

as the standard error made in estimating x from the relation

$$x = b_1 y,$$

and

$$\sigma_2\sqrt{1 - r^2}$$

as the standard error made in estimating y from the relation

$$y = b_2 x,$$

these interpretations being independent of the form of the
correlation.
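
The three residual sums in (7), and the reading of $\sigma_1\sqrt{1 - r^2}$ as a standard error of estimate, can be checked on data whose regression is deliberately curved and whose errors are skew. The sample and every name below are assumptions made for this sketch.

```python
import math
import random

rng = random.Random(3)
ys = [rng.gammavariate(2.0, 1.0) for _ in range(2000)]
xs = [0.5 * y ** 1.5 + rng.expovariate(1.0) for y in ys]   # curved regression

def centred(values):
    """Deviations of the values from their own mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]

x, y = centred(xs), centred(ys)
N = len(x)
sigma1 = math.sqrt(sum(a * a for a in x) / N)
sigma2 = math.sqrt(sum(b * b for b in y) / N)
r = sum(a * b for a, b in zip(x, y)) / (N * sigma1 * sigma2)
b1, b2 = r * sigma1 / sigma2, r * sigma2 / sigma1

# Left-hand sides of the three expressions in (7), with rho = r.
res_x = sum((a - b1 * b) ** 2 for a, b in zip(x, y))
res_y = sum((b - b2 * a) ** 2 for a, b in zip(x, y))
res_std = sum((a / sigma1 - r * b / sigma2) ** 2 for a, b in zip(x, y))

# Root-mean-square error of estimating x from x = b1 * y.
standard_error_x = math.sqrt(res_x / N)
```

Here `standard_error_x` agrees with `sigma1 * sqrt(1 - r**2)` although the arrays are neither normal nor centred on a straight line, which is the sense in which the interpretation is independent of the form of the correlation.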

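(2.) Case of Three Variables.

Let the three correlated variables be $X_1$, $X_2$, $X_3$, and let $x_1$, $x_2$, $x_3$
denote deviations of these variables from their respective means. Let
us write, for brevity,

$$S(x_1^2) = N\sigma_1^2, \qquad S(x_2^2) = N\sigma_2^2, \qquad S(x_3^2) = N\sigma_3^2,$$
$$S(x_1 x_2) = N r_{12}\sigma_1\sigma_2, \qquad S(x_2 x_3) = N r_{23}\sigma_2\sigma_3, \qquad S(x_1 x_3) = N r_{13}\sigma_1\sigma_3 .$$

Our characteristic or regression-equation will now be of the form

$$x_1 = b_{12}x_2 + b_{13}x_3 \qquad (9),$$

$b_{12}$ and $b_{13}$ being the unknowns to be determined from the observations
by the method of least squares. I have omitted a constant term on
the right-hand side, since its least-square value would be zero as
before. The two normal equations are now

$$S(x_1 x_2) = b_{12}S(x_2^2) + b_{13}S(x_2 x_3),$$
$$S(x_1 x_3) = b_{12}S(x_2 x_3) + b_{13}S(x_3^2),$$

or replacing the sums by the symbols defined above, and simplifying

$$r_{12}\sigma_1 = b_{12}\sigma_2 + b_{13}r_{23}\sigma_3,$$
$$r_{13}\sigma_1 = b_{12}r_{23}\sigma_2 + b_{13}\sigma_3 \qquad (10),$$

whence

$$b_{12} = \frac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2}\,\frac{\sigma_1}{\sigma_2}, \qquad b_{13} = \frac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}\,\frac{\sigma_1}{\sigma_3} \qquad (11).$$

That is, the characteristic relation between $x_1$ and $x_2$, $x_3$ is

$$x_1 = \frac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2}\,\frac{\sigma_1}{\sigma_2}\,x_2 + \frac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}\,\frac{\sigma_1}{\sigma_3}\,x_3 \qquad (12).$$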

Now Bravais showed that if the correlation were normal, and we
selected a group or array of $X_1$'s with regard to special values $h_2$ and
$h_3$ of $x_2$ and $x_3$, then $h_1$ being the deviation of the mean of the selected
$X_1$'s from the $X_1$-mean of the whole material,

$$h_1 = b_{12}h_2 + b_{13}h_3,$$

where $b_{12}$ and $b_{13}$ have the values given in (11). But evidently the
relation is of much greater generality; it holds good so long as $h_1$ is
a linear function of $h_2$ and $h_3$, whatever be the law of frequency.

Further, the values of $b_{12}$ and $b_{13}$ above determined, are, under any
circumstances, such that

$$S(v^2) = S[x_1 - (b_{12}x_2 + b_{13}x_3)]^2$$

is a minimum. If we insert in this expression the values of $b_{12}$ and
$b_{13}$ from (11), we have, after some reduction,

$$S(v^2) = N\sigma_1^2\left\{1 - \frac{r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23}}{1 - r_{23}^2}\right\} = N\sigma_1^2\,(1 - R_1^2) \qquad (13),$$

say. In normal correlation $\sigma_1\sqrt{1 - R_1^2}$ is the standard deviation of
an $x_1$-array, corresponding to any given types of $x_2$ and $x_3$. In
general correlation it may be regarded as the mean standard deviation
of the $x_1$-arrays from the plane

$$x_1 = b_{12}x_2 + b_{13}x_3,$$

or as the standard error made in estimating $x_1$ from $x_2$ and $x_3$ by
relation (12).

The quantity R is of some interest, as it exactly takes the place of
r in the residual expressions (7). $R_1$ may, in fact, be regarded as a
coefficient of correlation between $x_1$ and $(x_2 x_3)$; it can only be unity
if the linear relation (9) or (12) hold good in every case.
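
A short numerical check of (11) and (13) may be useful here. The sketch below uses synthetic skew data (an assumption of the example; all names are hypothetical). It computes $b_{12}$, $b_{13}$ and $R_1^2$ from the three correlation coefficients alone, then confirms that the residuals v are uncorrelated with $x_2$ and $x_3$, which is what the normal equations require, and that $S(v^2)$ equals $N\sigma_1^2(1 - R_1^2)$.

```python
import math
import random

rng = random.Random(5)
N = 500
X3 = [rng.expovariate(1.0) for _ in range(N)]
X2 = [0.5 * c + rng.expovariate(0.7) for c in X3]
X1 = [0.6 * b - 0.4 * c + rng.expovariate(1.0) for b, c in zip(X2, X3)]

def centred(values):
    """Deviations of the values from their own mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def S(u, v):
    """Sum of products over the whole sample."""
    return sum(p * q for p, q in zip(u, v))

x1, x2, x3 = centred(X1), centred(X2), centred(X3)
s1, s2, s3 = (math.sqrt(S(v, v) / N) for v in (x1, x2, x3))
r12 = S(x1, x2) / (N * s1 * s2)
r13 = S(x1, x3) / (N * s1 * s3)
r23 = S(x2, x3) / (N * s2 * s3)

b12 = (r12 - r13 * r23) / (1 - r23 ** 2) * s1 / s2            # equation (11)
b13 = (r13 - r12 * r23) / (1 - r23 ** 2) * s1 / s3
R1_sq = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)   # from (13)

v = [a - (b12 * b + b13 * c) for a, b, c in zip(x1, x2, x3)]  # residuals
```

The quantity `math.sqrt(S(v, v) / N)` is then the standard error of estimate $\sigma_1\sqrt{1 - R_1^2}$ of the text.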

The quantities $b_{12}$, $b_{13}$, &c. (the others may be written down by
symmetry), may be termed the net regressions of $x_1$ on $x_2$, $x_1$ on $x_3$,
&c. If we write 2 for 1 and 1 for 2 in the value of $b_{12}$, we have

$$b_{21} = \frac{r_{12} - r_{13}r_{23}}{1 - r_{13}^2}\,\frac{\sigma_2}{\sigma_1},$$

$b_{21}$ being the net regression of $x_2$ on $x_1$. In normal correlation,
$b_{12}$ and $b_{21}$ are the regressions for any group of $X_1$'s or $X_2$'s associated
with a fixed type of $X_3$'s. Hence, in this case (normal correlation),
the coefficient of correlation for such a group is the geometrical mean
of the two regressions, or

$$\rho_{12} = \sqrt{b_{12}b_{21}} = \frac{r_{12} - r_{13}r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}},$$

a quantity that may be called the net coefficient of correlation
between $x_1$ and $x_2$.* The similar net coefficients between $x_1$ and $x_3$,
$x_2$ and $x_3$, may be written down by interchanging the suffixes.

In normal correlation $\rho_{12}$ is quite strictly the coefficient of
correlation for any sub-group of $X_1$'s and $X_2$'s, whatever the associated type
of $X_3$'s. In generalised correlation this will not be so, and $\rho_{12}$ can
only retain an average significance.
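
One way to see the "average significance" of the net coefficient is to compare it with the ordinary correlation between what remains of $x_1$ and of $x_2$ after each has been estimated linearly from $x_3$ (the quantity now usually called a partial correlation; that reading is an addition of this sketch, not a statement in the paper). The data are synthetic and the function name `net_correlation` is hypothetical.

```python
import math
import random

def net_correlation(r12, r13, r23):
    """Net coefficient between variables 1 and 2, from the three ordinary coefficients."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

def centred(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def S(u, v):
    return sum(p * q for p, q in zip(u, v))

def corr(u, v):
    """Ordinary coefficient of correlation of two centred series."""
    return S(u, v) / math.sqrt(S(u, u) * S(v, v))

rng = random.Random(6)
N = 800
X3 = [rng.gammavariate(2.0, 1.0) for _ in range(N)]
X1 = [0.5 * c + rng.expovariate(1.0) for c in X3]
X2 = [0.8 * c + 0.4 * a + rng.expovariate(1.0) for a, c in zip(X1, X3)]
x1, x2, x3 = centred(X1), centred(X2), centred(X3)

r12, r13, r23 = corr(x1, x2), corr(x1, x3), corr(x2, x3)
s1, s2 = math.sqrt(S(x1, x1) / N), math.sqrt(S(x2, x2) / N)
b12 = (r12 - r13 * r23) / (1 - r23 ** 2) * s1 / s2
b21 = (r12 - r13 * r23) / (1 - r13 ** 2) * s2 / s1
rho12 = net_correlation(r12, r13, r23)

# Residuals of x1 and of x2 after estimating each from x3 alone.
e1 = [a - S(x1, x3) / S(x3, x3) * c for a, c in zip(x1, x3)]
e2 = [b - S(x2, x3) / S(x3, x3) * c for b, c in zip(x2, x3)]
rho12_from_residuals = corr(e1, e2)
```

The geometric mean of the two net regressions gives the magnitude of `rho12`; its sign is that of the common numerator $r_{12} - r_{13}r_{23}$.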

The method does not appear to be capable of investigating changes
in the net coefficient as we pass from one type to another, but it may
be noted that whatever the form of the correlation, $\rho_{12}$ retains three
of the chief properties of the ordinary coefficients: (1) it can only be
zero if both net regressions are zero; (2) it is a symmetrical
function of the variables; (3) it cannot be greater than unity; for,
by (13),

$$r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23} < 1 - r_{23}^2,$$

or adding $r_{13}^2 r_{23}^2$ to both sides, and transferring $r_{13}^2$ to the
right-hand side

$$(r_{12} - r_{13}r_{23})^2 < (1 - r_{13}^2)(1 - r_{23}^2) .$$

If any two coefficients, say $r_{12}$, $r_{13}$, be supposed known, the inequality
we have used above will give us limits for the value of the third.
Throwing it into the form

$$(r_{23} - r_{12}r_{13})^2 < 1 + r_{12}^2 r_{13}^2 - r_{12}^2 - r_{13}^2,$$

we have $r_{23}$ must lie between the limits

$$r_{12}r_{13} \pm \sqrt{1 + r_{12}^2 r_{13}^2 - r_{12}^2 - r_{13}^2} .$$

* My quantities, $b_{12}$, $b_{13}$, &c., were termed by Professor Pearson ("Regression,
&c.," 'Phil. Trans.,' A, vol. 187 (1896), p. 287), "Coefficients of double regression,"
and quantities like $b_{12}\sigma_2/\sigma_1$, $b_{13}\sigma_3/\sigma_1$, &c., "coefficients of double correlation." My
quantities $\rho$ he did not use. Having named the $\rho$'s "net correlation," it seemed
most natural to rename the b's "net regressions," as the b's and $\rho$'s are
corresponding quantities.

Some of my results given above were quoted by Professor Pearson in his paper
(loc. cit., notes on pp. 268 and 287).
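The values of these limits for some special cases are collected in
the following table:

    Values of $r_{12}$ and $r_{13}$.                      Limits of $r_{23}$.

    $r_{12} = r_{13} = 0$                                 $\pm 1$
    $r_{12} = r_{13} = \pm 1$                             $+1$
    $r_{12} = +1$, $r_{13} = -1$                          $-1$
    $r_{12} = 0$, $r_{13} = \pm 1$                        $0$
    $r_{12} = 0$, $r_{13} = \pm r$                        $\pm\sqrt{1 - r^2}$
    $r_{12} = r_{13} = \pm r$                             $1$ and $2r^2 - 1$
    $r_{12} = +r$, $r_{13} = -r$                          $-2r^2 + 1$ and $-1$
    $r_{12} = r_{13} = \pm\sqrt{0.5} = \pm 0.707$         $0$ and $1$
    $r_{12} = +\sqrt{0.5}$, $r_{13} = -\sqrt{0.5}$        $0$ and $-1$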

One is rather prone to argue that if A be correlated with B, and B
with C, A will be correlated with C. Evidently this is not necessary.
A may be positively correlated with B, and B positively correlated
with C, but yet A may, in general, be negatively correlated with C.
Only if the coefficients (AB) and (BC) are both numerically greater
than 0.707 can one even ascribe the correct sign to the (AC)
correlation.
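
The limits for the third coefficient are easy to tabulate. The function below is a minimal sketch of the expression $r_{12}r_{13} \pm \sqrt{1 + r_{12}^2 r_{13}^2 - r_{12}^2 - r_{13}^2}$; the name `r23_limits` and the guard against small negative round-off under the square root are choices made for this example only.

```python
import math

def r23_limits(r12, r13):
    """Return (lower, upper) limits of r23 when r12 and r13 are given."""
    centre = r12 * r13
    radicand = 1 + r12 ** 2 * r13 ** 2 - r12 ** 2 - r13 ** 2
    half_width = math.sqrt(max(radicand, 0.0))   # guard against round-off below zero
    return centre - half_width, centre + half_width

uncorrelated = r23_limits(0.0, 0.0)
perfect = r23_limits(1.0, 1.0)
borderline = r23_limits(math.sqrt(0.5), math.sqrt(0.5))
strong = r23_limits(0.8, 0.8)
opposite = r23_limits(0.6, -0.6)
```

With both coefficients at 0.8 the lower limit is $2(0.8)^2 - 1 = 0.28$, so the sign of the third correlation is fixed; at 0.707 the lower limit has just reached zero, which is the threshold named in the text.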

It is evident that one would, in general, expect to make a smaller
standard error in estimating $x_1$ from the two associated variables $x_2$
and $x_3$, than in estimating it from one only, say $x_2$. But it seems
desirable to prove this specifically, and to investigate under what
conditions it will hold good. The necessary condition is

$$\frac{r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23}}{1 - r_{23}^2} > r_{12}^2,$$

that is,

$$r_{13}^2 - 2r_{12}r_{13}r_{23} + r_{12}^2 r_{23}^2 > 0,$$

or

$$(r_{13} - r_{12}r_{23})^2 > 0 .$$

But $(r_{13} - r_{12}r_{23})$ is the numerator of $\rho_{13}$, the net coefficient of
correlation between $x_1$ and $x_3$. Hence the standard error when both $x_2$
and $x_3$ are used will be always less than when $x_2$ alone is used, so
long as $\rho_{13}$ is not zero. The condition is somewhat interesting.

To take an arithmetical example, suppose one had in some actual
case

$$r_{12} = +0.8, \qquad r_{23} = +0.5, \qquad r_{13} = +0.4 .$$

One might very naturally imagine that the introduction of the third
variable with a fairly high correlation coefficient (0.4) would
considerably lessen the standard deviation of the $x_1$-array; but this is
not so, for

$$\rho_{13} = \frac{0.4 - (0.5 \times 0.8)}{\sqrt{0.75 \times 0.36}} = 0,$$

so the third variable would be of no assistance.
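
The arithmetic of this example can be reproduced directly. The sketch below only evaluates the formulæ already given (the variable names are assumptions): the net coefficient $\rho_{13}$, the quantity $R_1^2$ of (13), and the two standard errors expressed as multiples of $\sigma_1$.

```python
import math

r12, r23, r13 = 0.8, 0.5, 0.4

# Net coefficient of correlation between x1 and x3.
rho13 = (r13 - r12 * r23) / math.sqrt((1 - r12 ** 2) * (1 - r23 ** 2))

# R1^2 from equation (13), using both x2 and x3.
R1_sq = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)

# Standard errors of estimate as multiples of sigma_1.
error_from_x2_only = math.sqrt(1 - r12 ** 2)
error_from_x2_and_x3 = math.sqrt(1 - R1_sq)
```

Both standard errors come out as $0.6\,\sigma_1$, since $R_1^2 = 0.48/0.75 = 0.64 = r_{12}^2$: bringing in $x_3$ leaves the error of estimate unchanged.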



III. Case of Four Variables.

This case is, perhaps, of sufficient practical importance to warrant
our developing the results at length as in the last.

If $x_1$, $x_2$, $x_3$, $x_4$ be the associated deviations of the four variables
from their respective means, the characteristic equation will be of the
form

$$x_1 = b_{12}x_2 + b_{13}x_3 + b_{14}x_4 \qquad (14).$$

The normal equations for the b's are, in our previous notation,

$$r_{12}\sigma_1 = b_{12}\sigma_2 + b_{13}r_{23}\sigma_3 + b_{14}r_{24}\sigma_4,$$
$$r_{13}\sigma_1 = b_{12}r_{23}\sigma_2 + b_{13}\sigma_3 + b_{14}r_{34}\sigma_4,$$
$$r_{14}\sigma_1 = b_{12}r_{24}\sigma_2 + b_{13}r_{34}\sigma_3 + b_{14}\sigma_4 .$$

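Hence

$$b_{12} = \frac{\begin{vmatrix} r_{12} & r_{23} & r_{24} \\ r_{13} & 1 & r_{34} \\ r_{14} & r_{34} & 1 \end{vmatrix}}{\begin{vmatrix} 1 & r_{23} & r_{24} \\ r_{23} & 1 & r_{34} \\ r_{24} & r_{34} & 1 \end{vmatrix}}\,\frac{\sigma_1}{\sigma_2} \qquad (15),$$

and so on for the others. $b_{12}$, $b_{13}$, &c., we may call the net regressions
of $x_1$ on $x_2$, $x_1$ on $x_3$, &c., as before. By parity of notation, we have

$$b_{21} = \frac{\begin{vmatrix} r_{12} & r_{13} & r_{14} \\ r_{23} & 1 & r_{34} \\ r_{24} & r_{34} & 1 \end{vmatrix}}{\begin{vmatrix} 1 & r_{13} & r_{14} \\ r_{13} & 1 & r_{34} \\ r_{14} & r_{34} & 1 \end{vmatrix}}\,\frac{\sigma_2}{\sigma_1},$$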

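and we may again call

$$\rho_{12} = \sqrt{b_{12}b_{21}}$$

the net coefficient of correlation between $x_1$ and $x_2$. Expanding the
determinants, we have, in fact,

$$\rho_{12} = \frac{r_{12}(1 - r_{34}^2) + r_{13}(r_{34}r_{24} - r_{23}) + r_{14}(r_{23}r_{34} - r_{24})}{\sqrt{[(1 - r_{34}^2) + r_{23}(r_{34}r_{24} - r_{23}) + r_{24}(r_{23}r_{34} - r_{24})]\,[(1 - r_{34}^2) + r_{13}(r_{34}r_{14} - r_{13}) + r_{14}(r_{13}r_{34} - r_{14})]}} \qquad (16).$$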

There are six such net coefficients, $\rho_{12}$, $\rho_{13}$, $\rho_{14}$, $\rho_{23}$, $\rho_{24}$, $\rho_{34}$. The
above values of the regressions are again those usually obtained on
the assumption of normal correlation.* The net correlation $\rho_{12}$
becomes, on that assumption, the coefficient of correlation for any
group of the $x_1$, $x_2$ variables associated with fixed types of $x_3$ and $x_4$.
If we write

$$u = x_1 - (b_{12}x_2 + b_{13}x_3 + b_{14}x_4),$$

we have, after some rather lengthy reduction,

$$S(u^2) = N\sigma_1^2\,(1 - R_1^2),$$

where

$$R_1^2 = \frac{r_{12}^2 + r_{13}^2 + r_{14}^2 - r_{12}^2 r_{34}^2 - r_{13}^2 r_{24}^2 - r_{14}^2 r_{23}^2 - 2(r_{12}r_{13}r_{23} + r_{12}r_{14}r_{24} + r_{13}r_{14}r_{34}) + 2(r_{12}r_{14}r_{23}r_{34} + r_{13}r_{14}r_{23}r_{24} + r_{12}r_{13}r_{24}r_{34})}{1 - r_{23}^2 - r_{24}^2 - r_{34}^2 + 2r_{23}r_{24}r_{34}} .$$

In normal correlation, $\sigma_1\sqrt{1 - R_1^2}$ is the standard deviation of all
$x_1$-arrays associated with fixed types of $x_2$, $x_3$, and $x_4$. In general
correlation, it is most easily interpreted as the standard error made in
estimating $x_1$, by equation (14), from its associated values of $x_2$, $x_3$,
and $x_4$.

As in the case of three variables, the quantity R may be considered
as a coefficient of correlation. It can range between $\pm 1$, and can
only become unity if the linear relation (14) hold good in each
individual instance.
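
The determinant form (15) and the expression for $R_1^2$ can be checked against the normal equations themselves. The sketch below works in standard units ($\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = 1$) with an arbitrary but admissible set of six coefficients; these numbers, the helper `det3`, and the identity $R_1^2 = b_{12}r_{12} + b_{13}r_{13} + b_{14}r_{14}$ used as a cross-check (it follows from the residuals being uncorrelated with $x_2$, $x_3$, $x_4$) are assumptions of the example rather than material from the paper.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Illustrative coefficients (hypothetical values).
r12, r13, r14, r23, r24, r34 = 0.5, 0.3, 0.4, 0.2, 0.1, 0.6

den = det3([[1, r23, r24], [r23, 1, r34], [r24, r34, 1]])
b12 = det3([[r12, r23, r24], [r13, 1, r34], [r14, r34, 1]]) / den   # equation (15)
b13 = det3([[1, r12, r24], [r23, r13, r34], [r24, r14, 1]]) / den
b14 = det3([[1, r23, r12], [r23, 1, r13], [r24, r34, r14]]) / den

# Right-hand sides of the three normal equations in standard units.
rhs1 = b12 + b13 * r23 + b14 * r24
rhs2 = b12 * r23 + b13 + b14 * r34
rhs3 = b12 * r24 + b13 * r34 + b14

# R1^2 from the expanded expression of the text.
R1_sq = (r12**2 + r13**2 + r14**2
         - r12**2 * r34**2 - r13**2 * r24**2 - r14**2 * r23**2
         - 2 * (r12 * r13 * r23 + r12 * r14 * r24 + r13 * r14 * r34)
         + 2 * (r12 * r14 * r23 * r34 + r13 * r14 * r23 * r24
                + r12 * r13 * r24 * r34)) / den
R1_sq_check = b12 * r12 + b13 * r13 + b14 * r14
```

In arbitrary units each $b_{1j}$ is simply multiplied by $\sigma_1/\sigma_j$, as in (15).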

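We showed at the end of the last section that the standard error
made in estimating $x_1$ from the relation

$$x_1 = b_{12}x_2 + b_{13}x_3$$

was always less than the standard error when only $x_2$ was taken into
account, unless

$$\rho_{13} = 0 .$$

We may now prove the similar theorem that when we use three
variables, $x_2$, $x_3$, $x_4$, on which to base the estimate, the standard error
will be again decreased, unless

$$\rho_{14} = 0 .$$

The condition that $S(u^2)$, in our present case, shall be less than
$S(v^2)$ in the last, is, in fact,

$$\left\{ r_{12}^2 + r_{13}^2 + r_{14}^2 - r_{12}^2 r_{34}^2 - r_{13}^2 r_{24}^2 - r_{14}^2 r_{23}^2 - 2(r_{12}r_{13}r_{23} + r_{12}r_{14}r_{24} + r_{13}r_{14}r_{34}) + 2(r_{12}r_{14}r_{23}r_{34} + r_{13}r_{14}r_{23}r_{24} + r_{12}r_{13}r_{24}r_{34}) \right\}(1 - r_{23}^2)$$
$$> (r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23})(1 - r_{23}^2 - r_{24}^2 - r_{34}^2 + 2r_{23}r_{24}r_{34}) .$$

This may be finally reduced to

$$\{ r_{14}(1 - r_{23}^2) + r_{12}(r_{23}r_{34} - r_{24}) + r_{13}(r_{23}r_{24} - r_{34}) \}^2 > 0,$$

that is $\rho_{14}^2 > 0$.

* Professor Pearson, "Regression, Heredity, and Panmixia," 'Phil. Trans.,'
A, vol. 187 (1896), p. 294.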
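
The reduction of this condition to a perfect square can be verified numerically. The function below is a sketch only (its name and the sample coefficients are assumptions): it returns the difference between the two sides of the inequality together with the square of the numerator of $\rho_{14}$, taken here as $r_{14}(1 - r_{23}^2) + r_{12}(r_{23}r_{34} - r_{24}) + r_{13}(r_{23}r_{24} - r_{34})$ by analogy with the three-variable case.

```python
import math

def gain_from_fourth_variable(r12, r13, r14, r23, r24, r34):
    """Return (lhs - rhs of the condition, squared numerator of rho_14)."""
    numerator_four = (r12**2 + r13**2 + r14**2
                      - r12**2 * r34**2 - r13**2 * r24**2 - r14**2 * r23**2
                      - 2 * (r12 * r13 * r23 + r12 * r14 * r24 + r13 * r14 * r34)
                      + 2 * (r12 * r14 * r23 * r34 + r13 * r14 * r23 * r24
                             + r12 * r13 * r24 * r34))
    determinant = 1 - r23**2 - r24**2 - r34**2 + 2 * r23 * r24 * r34
    numerator_three = r12**2 + r13**2 - 2 * r12 * r13 * r23
    difference = numerator_four * (1 - r23**2) - numerator_three * determinant
    square = (r14 * (1 - r23**2)
              + r12 * (r23 * r34 - r24)
              + r13 * (r23 * r24 - r34)) ** 2
    return difference, square

diff_a, square_a = gain_from_fourth_variable(0.5, 0.3, 0.4, 0.2, 0.1, 0.6)
diff_b, square_b = gain_from_fourth_variable(0.7, -0.2, 0.1, 0.3, -0.4, 0.25)
diff_c, square_c = gain_from_fourth_variable(0.6, 0.5, 0.0, 0.0, 0.0, 0.0)
```

In the third call $x_4$ is uncorrelated with everything else, the difference is zero, and the fourth variable gives no reduction of the standard error.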

The treatment of the general case of n variables, so far as regards
obtaining the regressions, is obvious, and it is unnecessary to give it
at length.

We can now see that the use of normal regression formulæ is quite
legitimate in all cases, so long as the necessary limitations of
interpretation are recognised. Bravais' r always remains a coefficient of
correlation. These results I must plead as justification for my use of
normal formulæ in two cases* where the correlation was markedly
non-normal.

* 'Economic Journal,' Dec., 1895, and Dec., 1896, "On the Correlation of Total
Pauperism with Proportion of Out-relief."



"Mathematical Contributions to the Theory of Evolution. On
a Form of Spurious Correlation which may arise when
Indices are used in the Measurement of Organs." By
Karl Pearson, F.R.S., University College, London.
Received December 29, 1896. Read February 18, 1897.

(1) If the ratio of two absolute measurements on the same or
different organs be taken it is convenient to term this ratio an index.

If $u = f_1(x, y)$ and $v = f_2(z, y)$ be two functions of the three variables
x, y, z, and these variables be selected at random so that there exists
no correlation between x, y; y, z; or z, x, there will still be found to



